ANOVA and
Regression: equivalent but not always equally useful
Advantages of regression (see also Keith, Multiple Regression and Beyond, 1E, p. 17)
Regression easily accommodates both categorical and
continuous IVs, whereas using continuous IVs in ANOVA (i.e., Analysis of
Covariance, "ANCOVA") is complicated and doesn't allow for potential interactions
between continuous and categorical IVs (see the sketch below).
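A minimal sketch of this in Python (statsmodels), assuming hypothetical variables dose (continuous IV), condition (categorical IV), and outcome (DV):

```python
# Sketch: one regression with a continuous IV, a categorical IV, and
# their interaction -- the design that is awkward to express as ANCOVA.
# Variable names (dose, condition, outcome) are hypothetical.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 120
df = pd.DataFrame({
    "dose": rng.uniform(0, 10, n),                         # continuous IV
    "condition": rng.choice(["control", "treatment"], n),  # categorical IV
})
# Simulate an outcome in which the slope of dose differs by condition.
slope = np.where(df["condition"] == "treatment", 1.5, 0.5)
df["outcome"] = 2.0 + slope * df["dose"] + rng.normal(0, 2, n)

# "dose * C(condition)" expands to dose + condition + dose:condition,
# so the continuous-by-categorical interaction is tested directly.
model = smf.ols("outcome ~ dose * C(condition)", data=df).fit()
print(model.summary())
```

One formula tests the main effects and the continuous-by-categorical interaction in a single model, with nothing special required for the mix of variable types.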
In regression, the effects of multiple IVs remain
interpretable even with four or more at once, whereas a four-way
(or higher) factorial ANOVA becomes unwieldy because of its many interaction
terms (terms that usually wouldn't be included in the regression analysis).
Regression is equally appropriate for both
experimental (manipulated) and non-experimental variables. This is true of
ANOVA too, and thus shouldn't count in favor of regression, though there is a
tendency to casually assign cause-and-effect interpretations to ANOVA more
readily than to regression.
Regression naturally emphasizes effect size by
reporting estimates like R², b, and β rather than just p-values,
and the linear model is made explicit. ANOVA has its own effect-size measures as
well, but the proportion-of-variance-explained type and the
standardized-difference-between-means type are a little less intuitive and
therefore more open to misinterpretation.
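Continuing the hypothetical sketch above, these estimates fall directly out of the fitted model; the standardized slope is just β = b · (SD of x / SD of y):

```python
# Continuing the sketch above: effect sizes come straight from the fit.
# model.params["dose"] is b for dose (its slope in the reference
# "control" condition, since the model includes the interaction);
# beta rescales b into standard-deviation units.
b = model.params["dose"]
beta = b * (df["dose"].std() / df["outcome"].std())
print(f"b = {b:.3f}, beta = {beta:.3f}, R^2 = {model.rsquared:.3f}")
```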
Advantages of ANOVA
ANOVA is probably more familiar to most researchers -- historically,
and still today.
In ANOVA there's a simpler treatment of interaction
effects (which, after all, may be a main focus of the research) and
repeated-measures effects, whereas regression would require multiple additional
columns of variables for product vectors and dummy-coded subject vectors.
ANOVA easily accommodates different error
terms for different effects in complex factorial designs, whereas regression
uses the model residual mean square (MS_residual) as the error term for every
effect or incremental R² tested.
Interactions in ANOVA and regression
In ANOVA, interactions are all included and analyzed
by default, which is appropriate for experimental designs in which interactions
are expected, planned for, and made to happen by controlling experimental conditions;
though even then they're mostly not significant, especially higher-order
interactions that aren't of theoretical interest.
In regression, interactions are analyzed only when
they're deliberately included in the analysis; it's appropriate to omit them in
nonexperimental field research measuring preexisting characteristics, where
interactions mostly don't occur or typically have little theoretical interest
when they do. When an interaction is of interest, it is handled in regression
as a moderation analysis, with a product term added to the model (sketched below).
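A minimal moderation sketch, assuming hypothetical variables x (predictor), m (continuous moderator), and y (outcome); the moderation effect is simply the coefficient on the product term x:m:

```python
# Sketch of a moderation analysis: does a continuous moderator m change
# the slope of x on y? The interaction is the product term x*m added to
# the regression. Variable names are hypothetical.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 200
x = rng.normal(0, 1, n)
m = rng.normal(0, 1, n)  # moderator
y = 1.0 + 0.4 * x + 0.3 * m + 0.5 * x * m + rng.normal(0, 1, n)
df = pd.DataFrame({"x": x, "m": m, "y": y})

# "x * m" expands to x + m + x:m; a significant x:m coefficient is the
# moderation (interaction) effect.
fit = smf.ols("y ~ x * m", data=df).fit()
print(fit.params)
print(fit.pvalues["x:m"])
```

In practice the predictor and moderator are often mean-centered before forming the product, which reduces collinearity between the product term and its components and makes the lower-order coefficients interpretable at the mean of the other variable.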
Reasons not to dichotomize a continuous variable to
make it a categorical variable
(see also Pedhazur, Multiple Regression in Behavioral Research,
3E, pp. 574-577)
Treating continuous variables as categorical is often
done only because the researcher is familiar with analyzing group differences
and less familiar with techniques for analyzing continuous variables.
It results in loss of information and accuracy, since precise
numerical measurements are thrown into a less precise batch of similar
measurements; e.g., the IQ scale is reduced to "low and high" or maybe
"low, medium, and high".
The "low-scorer" category actually ranges
from low to average, and the "high-scorers" range from average to
high, with nearly identical subjects in the middle of the range being labeled
arbitrarily differently -- but all subjects within each group have the same
category label and are considered to be different from all subjects in the
other group
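A small simulated sketch of the information loss (assumed data, not empirical results): with a truly linear score-outcome relationship, the median-split version of the score correlates noticeably more weakly with the outcome -- under normality, the classic result is attenuation to roughly 80% of the original correlation.

```python
# Sketch: how a median split discards information. With a true linear
# relation between a continuous score and an outcome, the dichotomized
# version correlates more weakly with the outcome. Data are simulated.
import numpy as np

rng = np.random.default_rng(2)
n = 1000
score = rng.normal(100, 15, n)  # e.g., an IQ-like scale
outcome = 0.5 * score + rng.normal(0, 15, n)

# Median split: 0 = "low", 1 = "high"
high = (score > np.median(score)).astype(float)

r_continuous = np.corrcoef(score, outcome)[0, 1]
r_split = np.corrcoef(high, outcome)[0, 1]
print(f"r with continuous score:   {r_continuous:.2f}")
print(f"r with median-split score: {r_split:.2f}")
```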
Multiple splits to identify the highest and lowest
groups of scores result in loss of subjects from the excluded middle category
of scores.
Focusing on, say, just the top and bottom 10% of a
sample to identify the very high and very low scorers may wrongly imply
linearity in the whole data set -- it implicitly assumes that an
intermediate group would have scores intermediate between the high and low
groups, without considering that the low group could be an unrepresentative
drop-off in the tail of the distribution; it also throws away 80% of the subjects.
Reducing the number of subjects of interest
exacerbates subject loss in a two-phase evaluation: if, for instance, only 60%
of subjects show up for the second phase, group sizes that were initially
equal will probably be unequal by then.
Subjects are not randomly assigned to the high and low
dichotomized score categories, so any conclusions
about their group differences on other variables are confounded with unknown
factors and can't be attributed to the group IV.
Grouping results in loss of degrees of freedom from the
error term: A continuous variable takes up 1 degree of freedom for the
numerator of the F ratio, while a group variable takes up [number of groups -
1] degrees of freedom, leaving fewer df for the denominator and reducing the
likelihood of attaining statistical significance.
Dividing scores into three groups is better than
just two, and four is better than three, but with each additional group, more
df come out of the error term. Four groups take up 3 df, compared to the 1 df
for the continuous variable, so there are 2 fewer error df; six groups take up 5 df,
so there are 4 fewer error df. Fewer error df make it harder to attain
statistical significance (see the sketch below).
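A small sketch of the arithmetic, assuming N = 60 purely for illustration: each added group shifts a degree of freedom from the error term (the denominator of the F ratio) to the effect (its numerator).

```python
# Sketch of the df arithmetic, assuming N = 60 for illustration:
# each extra group moves a degree of freedom from the error term
# (denominator) to the effect (numerator).
N = 60
for label, df_effect in [("continuous IV", 1), ("4 groups", 3), ("6 groups", 5)]:
    df_error = N - df_effect - 1  # error df after intercept and effect
    print(f"{label:13s}: effect df = {df_effect}, error df = {df_error}")
```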
Making groups out of a continuous IV is better only
when the IV-DV relationship is nonlinear, since the F statistic for the grouped
IV includes variance from both linear and nonlinear effects. Regression
measures only linear IV-DV relationships, so if most of an IV's relationship to
the DV is nonlinear, ANOVA captures what regression would not. In regression,
however, a nonlinear IV-DV relationship may be captured by adding a quadratic
term (or even a cubic term) to the equation, especially if there are
theoretical grounds for predicting that relationship (sketched below).
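A brief sketch, with simulated data assumed for illustration: when the true relationship is curved, the linear-only model misses most of the variance that the model with a quadratic term recovers.

```python
# Sketch: capturing a nonlinear IV-DV relation in regression by adding
# a quadratic term, rather than by cutting the IV into groups.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
n = 300
x = rng.uniform(-3, 3, n)
y = 1.0 + 0.2 * x - 0.8 * x**2 + rng.normal(0, 1, n)  # true curve is quadratic
df = pd.DataFrame({"x": x, "y": y})

linear = smf.ols("y ~ x", data=df).fit()
quadratic = smf.ols("y ~ x + I(x**2)", data=df).fit()
print(f"linear-only R^2:   {linear.rsquared:.2f}")
print(f"with x^2 term R^2: {quadratic.rsquared:.2f}")
```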